Overview

Dataset statistics

Number of variables: 22
Number of observations: 89
Missing cells: 285
Missing cells (%): 14.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.4 KiB
Average record size in memory: 177.4 B
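
The percentage and average above follow from the raw counts. A minimal sketch of that arithmetic, using only the figures reported in this overview (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 22
n_observations = 89
missing_cells = 285

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The report rounds total memory to 15.4 KiB, so the average record size
# recomputed from it only approximates the reported 177.4 B.
total_bytes = 15.4 * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"about {avg_record_bytes:.1f} B per record")
```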

Variable types

NUM: 13
CAT: 7
UNSUPPORTED: 1
BOOL: 1

Warnings

lpep_pickup_datetime has constant value "N" (89 occurrences) Constant
level_1 has a high cardinality: 88 distinct values High cardinality
VendorID has a high cardinality: 88 distinct values High cardinality
MTA_tax is highly correlated with Pickup_longitude High correlation
Pickup_longitude is highly correlated with MTA_tax High correlation
Tip_amount is highly correlated with Pickup_latitude and 1 other field High correlation
Pickup_latitude is highly correlated with Tip_amount High correlation
Dropoff_longitude is highly correlated with Tip_amount High correlation
Payment_type is highly correlated with Lpep_dropoff_datetime and 2 other fields High correlation
Lpep_dropoff_datetime is highly correlated with Payment_type High correlation
Fare_amount is highly correlated with Payment_type High correlation
Tolls_amount is highly correlated with Payment_type High correlation
VendorID is highly correlated with level_1 High correlation
level_1 is highly correlated with VendorID High correlation
Payment_type is highly correlated with Lpep_dropoff_datetime High correlation
Lpep_dropoff_datetime is highly correlated with Payment_type High correlation
Extra has 66 (74.2%) missing values Missing
Tip_amount has 23 (25.8%) missing values Missing
Total_amount has 36 (40.4%) missing values Missing
Payment_type has 71 (79.8%) missing values Missing
Trip_type has 89 (100.0%) missing values Missing
level_1 is uniformly distributed Uniform
VendorID is uniformly distributed Uniform
Trip_type is an unsupported type, check if it needs cleaning or further analysis Unsupported
Store_and_fwd_flag has 4 (4.5%) zeros Zeros
RateCodeID has 4 (4.5%) zeros Zeros
Pickup_longitude has 1 (1.1%) zeros Zeros
Pickup_latitude has 14 (15.7%) zeros Zeros
Dropoff_latitude has 13 (14.6%) zeros Zeros
Passenger_count has 3 (3.4%) zeros Zeros
Trip_distance has 47 (52.8%) zeros Zeros
Extra has 15 (16.9%) zeros Zeros
Tolls_amount has 5 (5.6%) zeros Zeros
Total_amount has 27 (30.3%) zeros Zeros
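
The percentages in the Missing and Zeros warnings are consistent with each count divided by the 89 observations. A small sketch that recomputes a few of them (the `pct` helper is illustrative; the counts are copied from the warnings above):

```python
N_ROWS = 89  # number of observations reported in the overview


def pct(count, total=N_ROWS):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts taken from the Missing and Zeros warnings.
missing_counts = {"Extra": 66, "Tip_amount": 23, "Total_amount": 36,
                  "Payment_type": 71, "Trip_type": 89}
zero_counts = {"Store_and_fwd_flag": 4, "Pickup_latitude": 14,
               "Trip_distance": 47, "Total_amount": 27}

missing_pct = {name: pct(n) for name, n in missing_counts.items()}
zero_pct = {name: pct(n) for name, n in zero_counts.items()}

for name, value in missing_pct.items():
    print(f"{name}: {value}% missing")
for name, value in zero_pct.items():
    print(f"{name}: {value}% zeros")
```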

Reproduction

Analysis started: 2020-12-31 01:52:16.830065
Analysis finished: 2020-12-31 01:52:36.164247
Duration: 19.33 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
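
The reported duration is the difference between the two timestamps. A short sketch using the standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2020-12-31 01:52:16.830065", FMT)
finished = datetime.strptime("2020-12-31 01:52:36.164247", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```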

Variables

level_0
Categorical

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value  Count  Frequency (%)
2      84     94.4%
1      5      5.6%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

level_1
Categorical

HIGH CARDINALITY
HIGH CORRELATION
UNIFORM

Distinct: 88
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value                Count  Frequency (%)
2015-11-01 00:57:34  2      2.2%
2015-08-01 00:01:57  1      1.1%
2019-09-01 00:10:53  1      1.1%
2019-10-01 00:26:02  1      1.1%
2015-07-01 00:01:10  1      1.1%
2019-08-01 00:22:12  1      1.1%
2018-12-01 00:23:25  1      1.1%
2020-02-01 00:16:59  1      1.1%
2013-10-01 00:00:00  1      1.1%
2020-06-01 00:09:05  1      1.1%
Other values (78)    78     87.6%

[Figure: Frequencies of value counts]

Unique

Unique: 87
Unique (%): 97.8%

[Figure: Histogram of lengths of the category]

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

VendorID
Categorical

HIGH CARDINALITY
HIGH CORRELATION
UNIFORM

Distinct: 88
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value                Count  Frequency (%)
2015-11-01 23:57:45  2      2.2%
2018-03-01 00:22:02  1      1.1%
2020-05-01 00:44:00  1      1.1%
2015-10-01 00:56:03  1      1.1%
2016-02-01 00:10:06  1      1.1%
2019-07-01 00:29:50  1      1.1%
2017-12-01 00:40:22  1      1.1%
2016-01-01 00:39:36  1      1.1%
2017-06-01 01:39:52  1      1.1%
2016-11-01 00:50:48  1      1.1%
Other values (78)    78     87.6%

[Figure: Frequencies of value counts]

Unique

Unique: 87
Unique (%): 97.8%

[Figure: Histogram of lengths of the category]

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

lpep_pickup_datetime
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value  Count  Frequency (%)
N      89     100.0%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Lpep_dropoff_datetime
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value  Count  Frequency (%)
1      80     89.9%
5      7      7.9%
4      1      1.1%
2      1      1.1%

[Figure: Frequencies of value counts]

Unique

Unique: 2
Unique (%): 2.2%

[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Store_and_fwd_flag
Real number (ℝ)

ZEROS

Distinct: 50
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.06999729
Minimum: -74.01079559
Maximum: 264
Zeros: 4
Zeros (%): 4.5%
Memory size: 712.0 B

Quantile statistics

Minimum: -74.01079559
5-th percentile: -73.96972504
Q1: 0
median: 74
Q3: 145
95-th percentile: 260.8
Maximum: 264
Range: 338.0107956
Interquartile range (IQR): 145

Descriptive statistics

Standard deviation: 108.5105896
Coefficient of variation (CV): 1.445458818
Kurtosis: -0.8953076846
Mean: 75.06999729
Median Absolute Deviation (MAD): 71
Skewness: 0.2284766632
Sum: 6681.229759
Variance: 11774.54805
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
264                5      5.6%
66                 5      5.6%
255                4      4.5%
74                 4      4.5%
92                 4      4.5%
193                4      4.5%
97                 4      4.5%
0                  4      4.5%
42                 3      3.4%
80                 3      3.4%
Other values (40)  49     55.1%

Minimum 5 values

Value         Count  Frequency (%)
-74.01079559  1      1.1%
-73.99118042  1      1.1%
-73.98764038  1      1.1%
-73.98733521  1      1.1%
-73.97948456  1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
264    5      5.6%
256    1      1.1%
255    4      4.5%
244    2      2.2%
236    1      1.1%
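
Several of the statistics listed for Store_and_fwd_flag can be derived from the others. The sketch below recomputes range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation; it uses the standard definitions of those quantities and is not taken from the pandas-profiling source:

```python
import math

# Values copied from the Store_and_fwd_flag statistics above.
minimum, maximum = -74.01079559, 264
q1, q3 = 0, 145
mean = 75.06999729
std = 108.5105896

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = std squared

print(f"range={value_range:.7f} iqr={iqr} cv={cv:.9f} variance={variance:.5f}")

# Compare against the figures printed in the report.
assert math.isclose(value_range, 338.0107956, rel_tol=1e-9)
assert math.isclose(cv, 1.445458818, rel_tol=1e-8)
assert math.isclose(variance, 11774.54805, rel_tol=1e-8)
```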
 

RateCodeID
Real number (ℝ≥0)

ZEROS

Distinct: 62
Distinct (%): 69.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 109.573833
Minimum: 0
Maximum: 265
Zeros: 4
Zeros (%): 4.5%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 9.4
Q1: 40.72150803
median: 74
Q3: 192
95-th percentile: 263
Maximum: 265
Range: 265
Interquartile range (IQR): 151.278492

Descriptive statistics

Standard deviation: 85.47290806
Coefficient of variation (CV): 0.7800485364
Kurtosis: -1.164983209
Mean: 109.573833
Median Absolute Deviation (MAD): 55
Skewness: 0.5436935547
Sum: 9752.071136
Variance: 7305.618012
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
193                6      6.7%
0                  4      4.5%
41                 4      4.5%
264                3      3.4%
256                3      3.4%
49                 3      3.4%
145                2      2.2%
129                2      2.2%
97                 2      2.2%
14                 2      2.2%
Other values (52)  58     65.2%

Minimum 5 values

Value  Count  Frequency (%)
0      4      4.5%
7      1      1.1%
13     1      1.1%
14     2      2.2%
17     1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
265    1      1.1%
264    3      3.4%
263    2      2.2%
256    3      3.4%
247    1      1.1%

Pickup_longitude
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 25
Distinct (%): 28.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -16.92781616
Minimum: -74.01078033
Maximum: 5
Zeros: 1
Zeros (%): 1.1%
Memory size: 712.0 B

Quantile statistics

Minimum: -74.01078033
5-th percentile: -73.98213654
Q1: 0
median: 1
Q3: 1
95-th percentile: 5
Maximum: 5
Range: 79.01078033
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 32.87896324
Coefficient of variation (CV): -1.942303894
Kurtosis: -0.5968203748
Mean: -16.92781616
Median Absolute Deviation (MAD): 0
Skewness: -1.186332462
Sum: -1506.575638
Variance: 1081.026224
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=25)]

Common values

Value              Count  Frequency (%)
1                  48     53.9%
5                  12     13.5%
2                  6      6.7%
-73.95898438       2      2.2%
-73.86499023       1      1.1%
-74.00005341       1      1.1%
-73.9841156        1      1.1%
-73.92427826       1      1.1%
-73.97253418       1      1.1%
-73.95726013       1      1.1%
Other values (15)  15     16.9%

Minimum 5 values

Value         Count  Frequency (%)
-74.01078033  1      1.1%
-74.00005341  1      1.1%
-73.98732758  1      1.1%
-73.9841156   1      1.1%
-73.98394012  1      1.1%

Maximum 5 values

Value         Count  Frequency (%)
5             12     13.5%
2             6      6.7%
1             48     53.9%
0             1      1.1%
-73.83633423  1      1.1%

Pickup_latitude
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 72
Distinct (%): 80.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.7542269
Minimum: 0
Maximum: 90.41
Zeros: 14
Zeros (%): 15.7%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.76
median: 2.8
Q3: 40.66344833
95-th percentile: 40.81643906
Maximum: 90.41
Range: 90.41
Interquartile range (IQR): 39.90344833

Descriptive statistics

Standard deviation: 18.7378246
Coefficient of variation (CV): 1.469146248
Kurtosis: 1.915168236
Mean: 12.7542269
Median Absolute Deviation (MAD): 2.67
Skewness: 1.527077381
Sum: 1135.126194
Variance: 351.1060707
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
0                  14     15.7%
0.76               2      2.2%
0.72               2      2.2%
1.5                2      2.2%
40.66344833        2      2.2%
40.78588486        1      1.1%
40.72904587        1      1.1%
40.76037979        1      1.1%
40.69804382        1      1.1%
40.76064682        1      1.1%
Other values (62)  62     69.7%

Minimum 5 values

Value  Count  Frequency (%)
0      14     15.7%
0.24   1      1.1%
0.51   1      1.1%
0.58   1      1.1%
0.67   1      1.1%

Maximum 5 values

Value        Count  Frequency (%)
90.41        1      1.1%
40.91221237  1      1.1%
40.84508896  1      1.1%
40.8260994   1      1.1%
40.8189888   1      1.1%

Dropoff_longitude
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 38
Distinct (%): 42.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.67640449
Minimum: 1
Maximum: 404.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 712.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2.5
median: 6
Q3: 12.5
95-th percentile: 25.5
Maximum: 404.5
Range: 403.5
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 42.64503601
Coefficient of variation (CV): 3.364127109
Kurtosis: 83.59811148
Mean: 12.67640449
Median Absolute Deviation (MAD): 5
Skewness: 9.011845796
Sum: 1128.2
Variance: 1818.599096
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=38)]

Common values

Value              Count  Frequency (%)
1                  19     21.3%
5                  6      6.7%
2.5                5      5.6%
6                  5      5.6%
3.5                3      3.4%
20                 3      3.4%
13.5               3      3.4%
3                  3      3.4%
9                  3      3.4%
11                 3      3.4%
Other values (28)  36     40.4%

Minimum 5 values

Value  Count  Frequency (%)
1      19     21.3%
1.5    1      1.1%
2      1      1.1%
2.5    5      5.6%
3      3      3.4%

Maximum 5 values

Value  Count  Frequency (%)
404.5  1      1.1%
28.2   1      1.1%
28     1      1.1%
27.5   1      1.1%
26.5   1      1.1%

Dropoff_latitude
Real number (ℝ≥0)

ZEROS

Distinct: 20
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9204494382
Minimum: 0
Maximum: 14.35
Zeros: 13
Zeros (%): 14.6%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.5
median: 0.5
Q3: 0.5
95-th percentile: 3.828
Maximum: 14.35
Range: 14.35
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.814723815
Coefficient of variation (CV): 1.971562738
Kurtosis: 34.96111593
Mean: 0.9204494382
Median Absolute Deviation (MAD): 0
Skewness: 5.323121676
Sum: 81.92
Variance: 3.293222523
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=20)]

Common values

Value              Count  Frequency (%)
0.5                57     64.0%
0                  13     14.6%
0.09               2      2.2%
0.2                1      1.1%
4                  1      1.1%
3.25               1      1.1%
5.24               1      1.1%
6.11               1      1.1%
3.57               1      1.1%
0.19               1      1.1%
Other values (10)  10     11.2%

Minimum 5 values

Value  Count  Frequency (%)
0      13     14.6%
0.09   2      2.2%
0.19   1      1.1%
0.2    1      1.1%
0.49   1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
14.35  1      1.1%
6.11   1      1.1%
5.24   1      1.1%
4.33   1      1.1%
4      1      1.1%

Passenger_count
Real number (ℝ)

ZEROS

Distinct: 23
Distinct (%): 25.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.478089888
Minimum: -15
Maximum: 52
Zeros: 3
Zeros (%): 3.4%
Memory size: 712.0 B

Quantile statistics

Minimum: -15
5-th percentile: 0.5
Q1: 0.5
median: 0.5
Q3: 0.5
95-th percentile: 20.55
Maximum: 52
Range: 67
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.69723296
Coefficient of variation (CV): 2.38879371
Kurtosis: 10.48827477
Mean: 4.478089888
Median Absolute Deviation (MAD): 0
Skewness: 3.111339855
Sum: 398.55
Variance: 114.4307929
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=23)]

Common values

Value              Count  Frequency (%)
0.5                63     70.8%
0                  3      3.4%
3.5                2      2.2%
12.5               2      2.2%
7.8                1      1.1%
45                 1      1.1%
50.5               1      1.1%
42                 1      1.1%
13                 1      1.1%
15.5               1      1.1%
Other values (13)  13     14.6%

Minimum 5 values

Value  Count  Frequency (%)
-15    1      1.1%
0      3      3.4%
0.5    63     70.8%
3.5    2      2.2%
5      1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
52     1      1.1%
50.5   1      1.1%
45     1      1.1%
42     1      1.1%
21.25  1      1.1%

Trip_distance
Real number (ℝ≥0)

ZEROS

Distinct: 27
Distinct (%): 30.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9943820225
Minimum: 0
Maximum: 7.64
Zeros: 47
Zeros (%): 52.8%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1.26
95-th percentile: 4.44
Maximum: 7.64
Range: 7.64
Interquartile range (IQR): 1.26

Descriptive statistics

Standard deviation: 1.649629404
Coefficient of variation (CV): 1.658949344
Kurtosis: 3.4868427
Mean: 0.9943820225
Median Absolute Deviation (MAD): 0
Skewness: 1.964365544
Sum: 88.5
Variance: 2.721277171
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=27)]

Common values

Value              Count  Frequency (%)
0                  47     52.8%
0.5                13     14.6%
1                  4      4.5%
4.44               2      2.2%
2.04               1      1.1%
2.5                1      1.1%
3                  1      1.1%
2.56               1      1.1%
5.76               1      1.1%
3.11               1      1.1%
Other values (17)  17     19.1%

Minimum 5 values

Value  Count  Frequency (%)
0      47     52.8%
0.01   1      1.1%
0.5    13     14.6%
0.7    1      1.1%
1      4      4.5%

Maximum 5 values

Value  Count  Frequency (%)
7.64   1      1.1%
6.03   1      1.1%
5.76   1      1.1%
4.76   1      1.1%
4.44   2      2.2%

Fare_amount
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value  Count  Frequency (%)
0      69     77.5%
0.5    17     19.1%
6.12   2      2.2%
5.76   1      1.1%

[Figure: Frequencies of value counts]

Unique

Unique: 1
Unique (%): 1.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 4
Median length: 3
Mean length: 3.033707865
Min length: 3

Extra
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 9
Distinct (%): 39.1%
Missing: 66
Missing (%): 74.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.218695652
Minimum: 0
Maximum: 10.3
Zeros: 15
Zeros (%): 16.9%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1.71
95-th percentile: 5.902
Maximum: 10.3
Range: 10.3
Interquartile range (IQR): 1.71

Descriptive statistics

Standard deviation: 2.471018017
Coefficient of variation (CV): 2.027592379
Kurtosis: 8.525754497
Mean: 1.218695652
Median Absolute Deviation (MAD): 0
Skewness: 2.817782614
Sum: 28.03
Variance: 6.10593004
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value      Count  Frequency (%)
0          15     16.9%
10.3       1      1.1%
1.26       1      1.1%
2.86       1      1.1%
1.95       1      1.1%
1.56       1      1.1%
6.24       1      1.1%
1.86       1      1.1%
2          1      1.1%
(Missing)  66     74.2%

Minimum 5 values

Value  Count  Frequency (%)
0      15     16.9%
1.26   1      1.1%
1.56   1      1.1%
1.86   1      1.1%
1.95   1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
10.3   1      1.1%
6.24   1      1.1%
2.86   1      1.1%
2      1      1.1%
1.95   1      1.1%
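
For variables with missing values, the figures in this report are consistent with Distinct (%) and Mean being computed over the non-missing rows only, while Zeros (%) and Missing (%) use all 89 observations. The sketch below checks that reading against the reported numbers; it is an inference from the figures, not a statement about pandas-profiling internals:

```python
N_ROWS = 89  # observations in the dataset

# (distinct, missing, reported distinct %) copied from the variable sections.
reported = {
    "Extra": (9, 66, 39.1),
    "Tip_amount": (52, 23, 78.8),
    "Total_amount": (5, 36, 9.4),
    "Payment_type": (2, 71, 11.1),
}

# Distinct (%) relative to the non-missing row count.
distinct_pct = {
    name: round(100 * distinct / (N_ROWS - missing), 1)
    for name, (distinct, missing, _) in reported.items()
}

# Extra: mean = sum / non-missing count, zeros % = zeros / all rows.
extra_mean = 28.03 / (N_ROWS - 66)
extra_zero_pct = round(100 * 15 / N_ROWS, 1)

for name, (_, _, expected) in reported.items():
    print(name, distinct_pct[name], "reported:", expected)
print(f"Extra mean={extra_mean:.9f} zeros={extra_zero_pct}%")
```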
 

MTA_tax
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 712.0 B

Value  Count  Frequency (%)
0.3    64     71.9%
0      25     28.1%

Tip_amount
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 52
Distinct (%): 78.8%
Missing: 23
Missing (%): 25.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.6219697
Minimum: 1.7
Maximum: 413.51
Zeros: 0
Zeros (%): 0.0%
Memory size: 712.0 B

Quantile statistics

Minimum: 1.7
5-th percentile: 3.8
Q1: 7.3
median: 10.13
Q3: 19.24
95-th percentile: 34.4625
Maximum: 413.51
Range: 411.81
Interquartile range (IQR): 11.94

Descriptive statistics

Standard deviation: 50.05578864
Coefficient of variation (CV): 2.551007336
Kurtosis: 61.55362179
Mean: 19.6219697
Median Absolute Deviation (MAD): 5.285
Skewness: 7.722510339
Sum: 1295.05
Variance: 2505.581976
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
7.3                4      4.5%
3.8                4      4.5%
7.8                3      3.4%
4.3                3      3.4%
19.24              2      2.2%
13.3               2      2.2%
12.3               2      2.2%
5.8                2      2.2%
14.3               1      1.1%
6.24               1      1.1%
Other values (42)  42     47.2%
(Missing)          23     25.8%

Minimum 5 values

Value  Count  Frequency (%)
1.7    1      1.1%
3.3    1      1.1%
3.8    4      4.5%
4.3    3      3.4%
4.8    1      1.1%

Maximum 5 values

Value   Count  Frequency (%)
413.51  1      1.1%
38.19   1      1.1%
36.2    1      1.1%
34.56   1      1.1%
34.17   1      1.1%

Tolls_amount
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 9
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.701685393
Minimum: 0
Maximum: 61.8
Zeros: 5
Zeros (%): 5.6%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0.12
Q1: 1
median: 1
Q3: 2
95-th percentile: 3.2
Maximum: 61.8
Range: 61.8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 8.109760006
Coefficient of variation (CV): 3.001741071
Kurtosis: 38.64100191
Mean: 2.701685393
Median Absolute Deviation (MAD): 0.7
Skewness: 6.032733735
Sum: 240.45
Variance: 65.76820735
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=9)]

Common values

Value  Count  Frequency (%)
1      39     43.8%
2      27     30.3%
0.3    13     14.6%
0      5      5.6%
61.8   1      1.1%
14     1      1.1%
4      1      1.1%
42.5   1      1.1%
21.25  1      1.1%

Minimum 5 values

Value  Count  Frequency (%)
0      5      5.6%
0.3    13     14.6%
1      39     43.8%
2      27     30.3%
4      1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
61.8   1      1.1%
42.5   1      1.1%
21.25  1      1.1%
14     1      1.1%
4      1      1.1%

Ehail_fee
Real number (ℝ)

Distinct: 20
Distinct (%): 22.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.91494382
Minimum: -15
Maximum: 52.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 712.0 B

Quantile statistics

Minimum: -15
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 17.016
Maximum: 52.8
Range: 67.8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 8.67607383
Coefficient of variation (CV): 2.216142614
Kurtosis: 16.51451812
Mean: 3.91494382
Median Absolute Deviation (MAD): 0
Skewness: 3.608597201
Sum: 348.43
Variance: 75.2742571
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=20)]

Common values

Value              Count  Frequency (%)
1                  65     73.0%
2                  6      6.7%
8.8                1      1.1%
7                  1      1.1%
9.75               1      1.1%
45                 1      1.1%
15                 1      1.1%
-15                1      1.1%
11.8               1      1.1%
9.36               1      1.1%
Other values (10)  10     11.2%

Minimum 5 values

Value  Count  Frequency (%)
-15    1      1.1%
1      65     73.0%
2      6      6.7%
4.8    1      1.1%
7      1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
52.8   1      1.1%
45     1      1.1%
27.04  1      1.1%
19.3   1      1.1%
17.16  1      1.1%

Total_amount
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 5
Distinct (%): 9.4%
Missing: 36
Missing (%): 40.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8726415094
Minimum: 0
Maximum: 3
Zeros: 27
Zeros (%): 30.3%
Memory size: 712.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 2.75
Maximum: 3
Range: 3
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.05226706
Coefficient of variation (CV): 1.205841171
Kurtosis: -0.843539169
Mean: 0.8726415094
Median Absolute Deviation (MAD): 0
Skewness: 0.7997158501
Sum: 46.25
Variance: 1.107265965
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=5)]

Common values

Value      Count  Frequency (%)
0          27     30.3%
1          12     13.5%
2.75       7      7.9%
2          6      6.7%
3          1      1.1%
(Missing)  36     40.4%

Minimum 5 values

Value  Count  Frequency (%)
0      27     30.3%
1      12     13.5%
2      6      6.7%
2.75   7      7.9%
3      1      1.1%

Maximum 5 values

Value  Count  Frequency (%)
3      1      1.1%
2.75   7      7.9%
2      6      6.7%
1      12     13.5%
0      27     30.3%

Payment_type
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 11.1%
Missing: 71
Missing (%): 79.8%
Memory size: 712.0 B

Value      Count  Frequency (%)
1          13     14.6%
2          5      5.6%
(Missing)  71     79.8%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Trip_type
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 89
Missing (%): 100.0%
Memory size: 840.0 B

Interactions

[Figure: interaction plots between variables (images not included in the text)]
2020-12-30T20:52:32.354922image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.447677image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.541813image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.634565image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.727321image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.839427image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:32.935186image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.031877image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.127096image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.220844image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.312598image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.405352image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.498106image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.591851image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.684604image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.779350image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.872103image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:33.965162image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.060906image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.159596image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.262321image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.364055image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.464786image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.565516image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.667244image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.765020image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
2020-12-30T20:52:34.865713image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line, that is its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
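
The calculation described above can be sketched in plain Python. This is a minimal illustration and not the code the report itself runs; the function name `pearson_r` and the input values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)

# Shifting and (positively) rescaling x leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
```

The function assumes both inputs have the same length and that neither is constant; a constant column would make the denominator zero.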

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
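
As a rough sketch of the calculation above, the values can be converted to ranks and the covariance-over-standard-deviations formula applied to those ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values their average rank is an assumption of this sketch.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A monotonic but nonlinear relationship (y = x cubed) has perfectly aligned ranks.
rho = spearman_rho([1, 2, 3, 4], [1, 8, 27, 64])
```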

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
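
The pair-counting definition above can be sketched as follows. This is the simplest form of the coefficient, in which tied pairs count as neither concordant nor discordant; tie-adjusted variants exist and may be what a given library computes. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant and one discordant (the swapped middle values).
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
```

The double loop over pairs is quadratic in the number of observations, which is fine for a dataset of this size (89 rows) but not for large ones.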

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
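
A minimal sketch of a bias-corrected Cramér's V, assuming the correction takes the commonly cited form attributed to Bergsma (2013): the chi-squared statistic divided by n is reduced by (k-1)(r-1)/(n-1), and the row and column counts are shrunk in the same way. The function name `cramers_v_corrected` is hypothetical, and this is not necessarily the exact code the report uses.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    n = len(x)  # requires n > 1
    row_counts = Counter(x)
    col_counts = Counter(y)
    cell_counts = Counter(zip(x, y))

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            observed = cell_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_counts), len(col_counts)
    phi2 = chi2 / n
    # Assumed correction terms; see the lead-in above.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical labels: y is fully determined by x, so the association is perfect.
v_perfect = cramers_v_corrected(["a"] * 5 + ["b"] * 5, ["u"] * 5 + ["v"] * 5)
# Every combination is equally frequent, so there is no association.
v_none = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```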

Missing values


Sample

First rows

 | level_0 | level_1 | VendorID | lpep_pickup_datetime | Lpep_dropoff_datetime | Store_and_fwd_flag | RateCodeID | Pickup_longitude | Pickup_latitude | Dropoff_longitude | Dropoff_latitude | Passenger_count | Trip_distance | Fare_amount | Extra | MTA_tax | Tip_amount | Tolls_amount | Ehail_fee | Total_amount | Payment_type | Trip_type
0 | 2 | 2013-08-01 08:14:37 | 2013-08-01 09:09:06 | N | 1 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.0 | 0.00 | 21.25 | 0.0 | 0.0 | 0.00 | 0.0 | NaN | 21.25 | 2.00 | NaN | NaN | NaN
1 | 2 | 2013-09-01 00:02:00 | 2013-09-01 00:54:51 | N | 1 | -73.952408 | 40.810726 | -73.983940 | 40.676285 | 5.0 | 14.35 | 50.50 | 0.5 | 0.5 | 10.30 | 0.0 | NaN | 61.80 | 1.00 | NaN | NaN | NaN
2 | 2 | 2013-10-01 00:00:00 | 2013-10-01 15:33:36 | N | 1 | 0.000000 | 0.000000 | -73.903465 | 40.845089 | 1.0 | 0.19 | 42.00 | 0.0 | 0.5 | 0.00 | 0.0 | NaN | 42.50 | 2.00 | 1.0 | NaN | NaN
3 | 2 | 2013-11-01 00:00:00 | 2013-11-01 13:48:03 | N | 1 | 0.000000 | 0.000000 | -73.937103 | 40.760647 | 2.0 | 0.00 | 3.50 | 0.0 | 0.5 | 0.00 | 0.0 | NaN | 4.00 | 2.00 | NaN | NaN | NaN
4 | 2 | 2013-12-01 00:00:00 | 2013-12-01 20:44:23 | N | 1 | 0.000000 | 0.000000 | -73.957260 | 40.742355 | 1.0 | 4.00 | 13.00 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 14.00 | 2.00 | NaN | NaN | NaN
5 | 2 | 2015-07-01 00:01:10 | 2015-07-01 00:19:04 | N | 1 | -73.940628 | 40.715027 | -73.912315 | 40.760380 | 1.0 | 4.33 | 15.50 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.30 | 16.80 | 2.0 | 1.0 | NaN
6 | 2 | 2015-07-01 00:05:35 | 2015-07-01 00:17:42 | N | 1 | -73.951134 | 40.804947 | -73.867218 | 40.818989 | 1.0 | 6.11 | 18.00 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.30 | 19.30 | 1.0 | 1.0 | NaN
7 | 2 | 2015-08-01 00:00:03 | 2015-08-01 00:00:07 | N | 5 | -73.865013 | 40.826099 | -73.864990 | 40.826099 | 1.0 | 0.00 | 7.00 | 0.0 | 0.0 | 0.00 | 0.0 | NaN | 0.00 | 7.00 | 1.0 | 2.0 | NaN
8 | 2 | 2015-08-01 00:01:57 | 2015-08-01 00:02:00 | N | 2 | -73.987335 | 40.692123 | -73.987328 | 40.692123 | 1.0 | 0.00 | 52.00 | 0.0 | 0.5 | 0.00 | 0.0 | NaN | 0.30 | 52.80 | 2.0 | 1.0 | NaN
9 | 2 | 2015-09-01 00:02:34 | 2015-09-01 00:02:38 | N | 5 | -73.979485 | 40.684956 | -73.979431 | 40.685020 | 1.0 | 0.00 | 7.80 | 0.0 | 0.0 | 1.95 | 0.0 | NaN | 0.00 | 9.75 | 1.0 | 2.0 | NaN

Last rows

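 | level_0 | level_1 | VendorID | lpep_pickup_datetime | Lpep_dropoff_datetime | Store_and_fwd_flag | RateCodeID | Pickup_longitude | Pickup_latitude | Dropoff_longitude | Dropoff_latitude | Passenger_count | Trip_distance | Fare_amount | Extra | MTA_tax | Tip_amount | Tolls_amount | Ehail_fee | Total_amount | Payment_type | Trip_type
79 | 2 | 2020-02-01 00:10:25 | 2020-02-01 00:14:34 | N | 1 | 74.0 | 41.0 | 1.0 | 0.76 | 4.5 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 5.80 | 2.0 | 1.0 | 0.00 | NaN | NaN
80 | 2 | 2020-02-01 00:16:59 | 2020-02-01 00:21:35 | N | 1 | 74.0 | 74.0 | 1.0 | 0.72 | 5.0 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 6.30 | 1.0 | 1.0 | 0.00 | NaN | NaN
81 | 2 | 2020-03-01 00:20:18 | 2020-03-01 00:45:29 | N | 1 | 41.0 | 13.0 | 1.0 | 8.24 | 26.5 | 0.5 | 0.5 | 7.64 | 0.0 | NaN | 0.3 | 38.19 | 1.0 | 1.0 | 2.75 | NaN | NaN
82 | 2 | 2020-03-01 00:15:42 | 2020-03-01 00:44:36 | N | 1 | 181.0 | 107.0 | 1.0 | 4.87 | 21.0 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 25.05 | 2.0 | 1.0 | 2.75 | NaN | NaN
83 | 2 | 2020-04-01 00:44:02 | 2020-04-01 00:52:23 | N | 1 | 42.0 | 41.0 | 1.0 | 1.68 | 8.0 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 9.30 | 1.0 | 1.0 | 0.00 | NaN | NaN
84 | 2 | 2020-04-01 00:24:39 | 2020-04-01 00:33:06 | N | 1 | 244.0 | 247.0 | 2.0 | 1.94 | 9.0 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 10.30 | 2.0 | 1.0 | 0.00 | NaN | NaN
85 | 2 | 2020-05-01 00:27:48 | 2020-05-01 00:32:47 | N | 1 | 74.0 | 42.0 | 1.0 | 1.50 | 6.5 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 7.80 | 1.0 | 1.0 | 0.00 | NaN | NaN
86 | 2 | 2020-05-01 00:39:13 | 2020-05-01 00:44:00 | N | 1 | 244.0 | 116.0 | 2.0 | 0.92 | 6.0 | 0.5 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 7.30 | 2.0 | 1.0 | 0.00 | NaN | NaN
87 | 1 | 2020-06-01 00:22:07 | 2020-06-01 00:39:03 | N | 1 | 255.0 | 14.0 | 1.0 | 0.00 | 28.2 | 0.0 | 0.5 | 0.00 | 0.0 | NaN | 0.3 | 29.00 | 1.0 | 1.0 | 0.00 | NaN | NaN
88 | 2 | 2020-06-01 00:09:05 | 2020-06-01 00:22:46 | N | 1 | 166.0 | 141.0 | 1.0 | 3.43 | 13.0 | 0.5 | 0.5 | 3.41 | 0.0 | NaN | 0.3 | 20.46 | 1.0 | 1.0 | 2.75 | NaN | NaN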